22 research outputs found

    Foundation, Implementation and Evaluation of the MorphoSaurus System: Subword Indexing, Lexical Learning and Word Sense Disambiguation for Medical Cross-Language Information Retrieval

    In everyday medical practice, which involves a great deal of documentation and search work, most textually encoded information is now available electronically. The development of powerful methods for efficient retrieval is therefore of primary importance. Judged from the perspective of medical sublanguage, common text retrieval systems lack morphological functionality (inflection, derivation and composition), lexical-semantic functionality, and the ability to analyze large document collections across languages. This doctoral thesis covers the theoretical foundations of the MorphoSaurus system (an acronym for morpheme thesaurus). Its methodological core is a thesaurus organized around morphemes of medical expert and lay language, whose entries are linked across languages by semantic relations. Building on this, a procedure is presented that segments (complex) words into morphemes, which are then replaced by language-independent, concept-class-like symbols. The resulting representation is the basis for cross-language, morpheme-oriented text retrieval. In addition to the core technology, a method for the automatic acquisition of lexicon entries is presented, by which existing morpheme lexicons are extended to further languages. Taking cross-language phenomena into account then leads to a novel method for resolving semantic ambiguities. The performance of morpheme-oriented text retrieval is tested empirically in extensive, standardized evaluations and compared with common approaches

    Spatial location and its relevance for terminological inferences in bio-ontologies

    Background: An adequate and expressive ontological representation of biological organisms and their parts requires formal reasoning mechanisms for their relations of physical aggregation and containment. Results: We demonstrate that the proposed formalism makes it possible to deal consistently with "role propagation along non-taxonomic hierarchies", a problem that has repeatedly been identified as an intricate reasoning problem in biomedical ontologies. Conclusion: The proposed approach seems to be suitable for the redesign of compositional hierarchies in (bio)medical terminology systems which are embedded into the framework of the OBO (Open Biological Ontologies) Relation Ontology and which use knowledge representation languages developed by the Semantic Web community.

    Pozícionális gének aktivitásának szerepe az idegsejt-fenotípus meghatározásában = Role of positional genes in the determination of the neuronal phenotype

    Studies on the developmental roles of "positional genes" in the formation of regional brain features led to the following results: 1. Early embryonic neural stem cells (NE-4C) are not regionally committed in the non-induced state; during the neuron-formation phase, however, a wide range of regional genes is activated, and GABAergic, glutamatergic and serotonin-producing neurons develop from the clone. 2. By transfecting NE-4C cells with the Emx2 (antero-dorsal telencephalic) positional gene, 11 sub-clones were established, all expressing Emx2 throughout the entire differentiation period. The NE-4Cemx2+ cells displayed altered adhesive characteristics and could also generate catecholamine-producing neurons. 3. Using novel synthetic adhesive peptide conjugates, neural stem/progenitor clones were established from embryonic and adult mouse brain, covering different ages and regions. 4. Neural stem/progenitor clones from different brain regions, and the neural cell types developing from them in vitro, preserved several features characteristic of their origin, while the expression profile of traditional "regional genes" failed to reflect this regional "memory". 5. The stem/progenitor clones differed in retinoid sensitivity and inherent retinoic acid metabolism. The neurogenic zones of the adult brain were shown to have a high retinoic acid content. 6. The adult brain parenchyma is not permissive for implanted stem/progenitor cells, regardless of their origin. In cortical lesion sites, the stem/progenitor cells proliferate but do not differentiate. Hyperbaric O2 treatment was shown to allow sporadic neuronal differentiation

    Formal representation of complex SNOMED CT expressions

    Background: Definitory expressions about clinical procedures, findings and diseases constitute a major benefit of a formally founded clinical reference terminology which is ontologically sound and suited for formal reasoning. SNOMED CT claims to support formal reasoning by description-logic based concept definitions. Methods: On the basis of formal ontology criteria we analyze complex SNOMED CT concepts, such as "Concussion of Brain with(out) Loss of Consciousness", using alternatively full first-order logic and the description logic ℰℒ. Results: Typical complex SNOMED CT concepts, with or without negations, can be expressed in full first-order logic. Negations cannot be properly expressed in the description logic ℰℒ underlying SNOMED CT. All concepts whose meaning implies a temporal scope may be subject to diverging interpretations, which are often unclear in SNOMED CT because their contextual determinants are not made explicit. Conclusion: The description of complex medical occurrents is ambiguous, as the same situation can be described as (i) a complex occurrent C that has A and B as temporal parts, (ii) a simple occurrent A' defined as a kind of A followed by some B, or (iii) a simple occurrent B' defined as a kind of B preceded by some A. As negative statements in SNOMED CT cannot be exactly represented without a (computationally costly) extension of the set of logical constructors, a solution can be the reification of negative statements (e.g., "Period with no Loss of Consciousness"), or the use of the SNOMED CT context model. However, the interpretation of SNOMED CT context model concepts as description logic axioms is not recommended, because this may entail unintended models.

    Optimizing and evaluating the MEDPILOT search engine. Boosting medical information retrieval by using a morpheme thesaurus

    This article describes the implementation and evaluation of a computational linguistic approach to improve the quality of the MEDPILOT medical search engine, maintained by the German National Library of Medicine. At the core of the system lies a new type of multilingual dictionary, in which entries are equivalence classes of morphemes, i.e. semantically minimal units. Documents, as well as user queries, are mapped to these language-independent (conceptual) classes on which retrieval operations are performed. Early results of the evaluation have shown that the language technology used has many advantages in medical information retrieval. In combination with up-to-date search software, the linguistic approach leads to more and better results (i.e. relevant hits) for phenomena such as synonyms, translations and linguistic variants (inflection, derivation, word composition, etc.). Additionally, a normalization of lay and expert queries can be achieved. The formal structure of user queries as well as the information needs of MEDPILOT users were analyzed in detail. Results and consequences for the development of user-centered design and usability improvements of the search engine will be discussed
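
The mapping of documents and queries to language-independent morpheme classes described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the lexicon entries, the class identifiers (`#kidney`, `#inflamm`, `#liver`), the function names and the greedy longest-match strategy are all invented here and are not taken from the MEDPILOT or MorphoSaurus implementation.

```python
# Toy subword lexicon. Every entry and class identifier is hypothetical and
# serves only to illustrate equivalence classes of morphemes across languages.
LEXICON = {
    "nephr": "#kidney", "ren": "#kidney", "nier": "#kidney", "kidney": "#kidney",
    "itis": "#inflamm", "inflamm": "#inflamm", "entzuend": "#inflamm",
    "hepat": "#liver", "leber": "#liver", "liver": "#liver",
}


def segment(word, lexicon=LEXICON):
    """Map one word to a list of class identifiers.

    Uses greedy longest-match segmentation (an assumption of this sketch);
    characters that no lexicon entry covers are skipped.
    """
    word = word.lower()
    classes = []
    i = 0
    while i < len(word):
        # Try the longest substring starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                classes.append(lexicon[word[i:j]])
                i = j
                break
        else:
            # No entry starts here: skip one character (e.g. linking elements).
            i += 1
    return classes


def normalize(text, lexicon=LEXICON):
    """Map a whitespace-tokenized text to its sequence of class identifiers."""
    result = []
    for token in text.split():
        result.extend(segment(token, lexicon))
    return result
```

With this toy lexicon, the expert term "Nephritis", the German compound "Nierenentzuendung" and the lay phrase "kidney inflammation" all normalize to the same identifier sequence, so a query in one form could match a document written in another.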

    Unsupervised Multilingual Word Sense Disambiguation via an Interlingua

    We present an unsupervised method for resolving word sense ambiguities in one language by using statistical evidence assembled from other languages. It is crucial for this approach that texts are mapped into a language-independent interlingual representation. We also show that the coverage and accuracy resulting from multilingual sources outperform analyses where only monolingual training data is taken into account

    Bootstrapping Dictionaries for Cross-Language Information Retrieval

    The bottleneck for dictionary-based cross-language information retrieval is the lack of comprehensive dictionaries, in particular for many different languages. Here we introduce a methodology by which multilingual dictionaries (for Spanish and Swedish) emerge automatically from simple seed lexicons. These seed lexicons are generated automatically, by cognate mapping, from previously manually constructed Portuguese, German and English sources. Lexical and semantic hypotheses are then validated, and new ones iteratively generated, by making use of co-occurrence patterns of hypothesized translation synonyms in parallel corpora. We evaluate these newly derived dictionaries on a large medical document collection within a cross-language retrieval setting
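
The cognate-mapping step mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the two Portuguese-to-Spanish suffix rules, the word lists and the function names are assumptions made for this example, and the later validation against co-occurrence patterns in parallel corpora is not shown.

```python
# Hypothetical suffix correspondences between Portuguese and Spanish,
# chosen only for illustration.
RULES = [("ção", "ción"), ("dade", "dad")]


def cognate_candidates(word, rules=RULES):
    """Return the word itself plus every rewritten form produced by a suffix rule."""
    candidates = {word}
    for source_suffix, target_suffix in rules:
        if word.endswith(source_suffix):
            candidates.add(word[: -len(source_suffix)] + target_suffix)
    return candidates


def seed_lexicon(source_words, target_vocab, rules=RULES):
    """Keep only candidates that are attested in the target-language word list."""
    seed = {}
    for word in source_words:
        hits = sorted(cognate_candidates(word, rules) & set(target_vocab))
        if hits:
            seed[word] = hits
    return seed
```

Given the Portuguese entries "infecção", "obesidade" and "rim" and a Spanish word list containing "infección", "obesidad" and "riñón", the sketch proposes the first two pairs and leaves "rim" unmapped, because no rule rewrites it into an attested Spanish form. Entries like that would have to come from other evidence.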

    An Integrated, Dual Learner for Grammars and Ontologies

    We introduce a dual-use methodology for automating the maintenance and growth of two types of knowledge sources that are crucial for natural language text understanding: background knowledge of the underlying domain, and linguistic knowledge about the lexicon and the grammar of the underlying natural language. A particularity of this approach is that learning occurs simultaneously with the ongoing text understanding process. The knowledge assimilation process is centered around the linguistic and conceptual ‘quality’ of various forms of evidence underlying the generation, assessment and ongoing refinement of lexical and concept hypotheses. On the basis of the strength of evidence, hypotheses are ranked according to qualitative plausibility criteria, and the most reasonable ones are selected for assimilation into the already given lexical class hierarchy and domain ontology. Key words: knowledge acquisition, natural language processing, ontology engineering, grammar learning, concept learning